Semantic Based Text Block Segmentation Using WordNet
نویسندگان
چکیده
منابع مشابه
Text block segmentation using pyramid structure
Text block segmentation is necessary in document layout analysis. An algorithm and its implementation that segregates text block by block (a block is either a title or a paragraph) from the provided document, e.g. newspaper image, based on pyramid structure is described in this paper. The pyramid structure, which is amenable for parallel processing on output, is a multi-resolution image represe...
متن کاملOpen Text Semantic Parsing Using FrameNet and WordNet
This paper describes a rule-based semantic parser that relies on a frame dataset (FrameNet), and a semantic network (WordNet), to identify semantic relations between words in open text, as well as shallow semantic features associated with concepts in the text. Parsing semantic structures allows semantic units and constituents to be accessed and processed in a more meaningful way than syntactic ...
متن کاملEfficient Hybrid Semantic Text Similarity using Wordnet and a Corpus
Text similarity plays an important role in natural language processing tasks such as answering questions and summarizing text. At present, state-of-the-art text similarity algorithms rely on inefficient word pairings and/or knowledge derived from large corpora such as Wikipedia. This article evaluates previous word similarity measures on benchmark datasets and then uses a hybrid word similarity...
متن کاملText Segmentation based on Semantic Word Embeddings
We explore the use of semantic word embeddings [14, 16, 12] in text segmentation algorithms, including the C99 segmentation algorithm [3, 4] and new algorithms inspired by the distributed word vector representation. By developing a general framework for discussing a class of segmentation objectives, we study the effectiveness of greedy versus exact optimization approaches and suggest a new iter...
متن کاملUnsupervised Text Segmentation Using Semantic Relatedness Graphs
Segmenting text into semantically coherent fragments improves readability of text and facilitates tasks like text summarization and passage retrieval. In this paper, we present a novel unsupervised algorithm for linear text segmentation (TS) that exploits word embeddings and a measure of semantic relatedness of short texts to construct a semantic relatedness graph of the document. Semantically ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
ژورنال
عنوان ژورنال: International Journal of Computer and Communication Engineering
سال: 2013
ISSN: 2010-3743
DOI: 10.7763/ijcce.2013.v2.257